Data Mining with Extended Symbolic Models

Authors

  • C. Apte
  • E. Pednault
Abstract

Symbolic modeling of data with decision trees or decision rules has a certain appeal to data mining application developers. The computationally efficient nature of the modeling methodology, and the inbuilt explanatory nature of the models that are generated, are two often cited reasons for the preferred use of these methods. Traditionally, the applications of these methods had been restricted to classification modeling. Recent extensions to these methods employing ideas from statistics and machine learning have resulted in more general frameworks that continue to exhibit the underlying characteristics but apply to a much wider class of applications. These extended symbolic modeling methodologies permit exciting new application avenues, including probabilistic modeling, text mining, and integrating data mining into knowledge-based frameworks. Highlights of work in this area in the data abstraction research group at IBM's T.J. Watson Research Center will be presented.

Similar resources

A New Algorithm for Optimization of Fuzzy Decision Tree in Data Mining

Decision-tree algorithms provide one of the most popular methodologies for symbolic knowledge acquisition. The resulting knowledge, a symbolic decision tree along with a simple inference mechanism, has been praised for comprehensibility. The most comprehensible decision trees have been designed for perfect symbolic data. Classical crisp decision trees (DT) are widely applied to classification t...

GPTIPS 2

GPTIPS is a free, open source MATLAB based software platform for symbolic data mining (SDM). It uses a ‘multigene’ variant of the biologically inspired machine learning method of genetic programming (MGGP) as the engine that drives the automatic model discovery process. Symbolic data mining is the process of extracting hidden, meaningful relationships from data in the form of symbolic equations...

Elite Bases Regression: A Real-time Algorithm for Symbolic Regression

Symbolic regression is an important but challenging research topic in data mining. It can detect the underlying mathematical models. Genetic programming (GP) is one of the most popular methods for symbolic regression. However, its convergence speed might be too slow for large scale problems with a large number of variables. This drawback has become a bottleneck in practical applications. In thi...

Far beyond the classical data models: symbolic data analysis

This paper introduces symbolic data analysis, explaining how it extends the classical data models to take into account more complete and complex information. Several examples motivate the approach, before the modeling of variables assuming new types of realizations are formally presented. Some methods for the (multivariate) analysis of symbolic data are presented and discussed. This is however ...

Evidence Sets: Modeling Subjective Categories

Zadeh’s Fuzzy Sets are extended with the Dempster-Shafer Theory of Evidence into a new mathematical structure called Evidence Sets, which can capture more efficiently all recognized forms of uncertainty in a formalism that explicitly models the subjective context dependencies of linguistic categories. A belief-based theory of Approximate Reasoning is proposed for these structures. Evidence sets...

Publication date: 1998